9 research outputs found
A Novel Sound Reconstruction Technique based on a Spike Code (event) Representation
This thesis focuses on the re-generation of sound from a spike-based coding system.
Three different types of spike-based coding system have been analysed. Two of them are biologically inspired, i.e. the spikes are generated in a way similar to how our auditory nerves generate spikes; these have been called AN (Auditory Nerve) spikes and AN Onset (Amplitude Modulated Onset) spikes. Sounds have been re-generated from the spikes produced by both of these coding techniques. A related event-based coding technique developed by Koickal has also been considered: sounds have been re-generated from spikes produced by Koickal's technique and the results compared.
Our brain does not reconstruct sound from the spikes received from the auditory nerves; it interprets them. However, by reconstructing sounds from these spike coding techniques, we can identify which spike-based technique is better and more efficient for coding different types of sound.
Many issues and challenges arise in reconstructing sound from spikes, and these are discussed. The AN spike technique generates the most spikes of the techniques tested, followed by Koickal's technique (54.4% fewer) and the AN Onset technique (85.6% fewer). Both subjective and objective testing have been carried out to assess the quality of the sounds reconstructed from these three spike coding techniques. Four types of sound were used in the subjective test: string, percussion, male voice and female voice. The objective test included these four types and many other types of sound. The results establish that AN spikes produce the best quality of decoded sound, but at the cost of many more spikes than the other techniques. AN Onset spikes produce better-quality decoded sound than Koickal's technique for most sounds, except choir-type sounds and noise, while producing 68.5% fewer spikes than Koickal's technique. This provides evidence that AN Onset spikes can outperform Koickal's spikes for most sound types.
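The AN, AN Onset and Koickal coders themselves are not detailed in this abstract. As a rough illustration of the general idea of event-based sound coding, the sketch below emits a spike whenever the waveform rises through one of a set of amplitude thresholds; the thresholds, sample rate and test tone are all illustrative stand-ins, not the thesis's models.

```python
import math

def level_crossing_spikes(signal, thresholds):
    """Encode a signal as (sample_index, level) spike events.

    A spike is emitted whenever the signal rises through one of the
    amplitude thresholds. This is a toy stand-in for the biologically
    inspired spike coders discussed in the thesis.
    """
    spikes = []
    for t in range(1, len(signal)):
        for level in thresholds:
            if signal[t - 1] < level <= signal[t]:
                spikes.append((t, level))
    return spikes

# A 100 Hz tone sampled at 8 kHz for 0.1 s, encoded with three
# hypothetical threshold levels.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs // 10)]
events = level_crossing_spikes(tone, thresholds=[0.25, 0.5, 0.75])
```

Reconstruction would then estimate the waveform from the event times and levels; comparing spike counts between coders, as the thesis does, reduces to comparing `len(events)` for the same input.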
Deep Transfer Learning based COVID-19 Detection in Cough, Breath and Speech using Bottleneck Features
We present an experimental investigation into the automatic detection of
COVID-19 from coughs, breaths and speech as this type of screening is
non-contact, does not require specialist medical expertise or laboratory
facilities and can easily be deployed on inexpensive consumer hardware.
Smartphone recordings of cough, breath and speech from subjects around the
globe are used for classification by seven standard machine learning
classifiers using leave-out cross-validation to provide a promising
baseline performance.
Then, a diverse dataset of 10.29 hours of cough, sneeze, speech and noise
audio recordings is used to pre-train CNN, LSTM and Resnet50 classifiers,
which are then fine-tuned to improve performance even further.
We have also extracted bottleneck features from these pre-trained models
by removing the final two layers and used them as input to LR, SVM, MLP
and KNN classifiers to detect the COVID-19 signature.
The highest AUC of 0.98 was achieved using a transfer-learning-based Resnet50
architecture on coughs from the Coswara dataset.
AUCs of 0.94 and 0.92 were achieved by an SVM run on the bottleneck features
extracted from breaths in the Coswara dataset and speech recordings in the
ComParE dataset, respectively.
We conclude that, among all vocal audio, coughs carry the strongest COVID-19
signature, followed by breath and speech, and that transfer learning improves
classifier performance, yielding higher AUC and lower variance across the
cross-validation folds.
Although these signatures are not perceivable by the human ear, machine
learning based COVID-19 detection is possible from vocal audio recorded via
smartphone.
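The pre-trained networks themselves are not reproduced in the abstract. The pure-Python sketch below illustrates only the bottleneck-feature idea on a made-up toy network: strip the final two layers of a "pre-trained" model and keep the intermediate activations as features for a downstream LR/SVM/MLP/KNN classifier. The layer sizes, tanh activation and random weights are hypothetical stand-ins, not the study's CNN/LSTM/Resnet50.

```python
import math
import random

random.seed(0)

def layer(x, w, b, act=True):
    """One dense layer: y = act(Wx + b), with tanh as a stand-in activation."""
    y = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [math.tanh(v) for v in y] if act else y

def make_weights(n_in, n_out):
    """Random weights standing in for a pre-trained layer."""
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

# A hypothetical "pre-trained" network: 8 inputs -> 16 -> 4 -> 2 classes.
w1, b1 = make_weights(8, 16)
w2, b2 = make_weights(16, 4)
w3, b3 = make_weights(4, 2)

def classify(x):
    """Full forward pass of the toy network (logits from the final layer)."""
    return layer(layer(layer(x, w1, b1), w2, b2), w3, b3, act=False)

def bottleneck_features(x):
    """Drop the final two layers (w2/b2 and the w3/b3 head) and return the
    intermediate activations as features for a shallow downstream classifier,
    mirroring the procedure described in the abstract."""
    return layer(x, w1, b1)  # 16-dimensional bottleneck representation
```

In the study these features come from networks pre-trained on cough, sneeze, speech and noise audio; the shallow classifier is then trained on the extracted vectors rather than on raw audio features.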
Deep Neural Network based Cough Detection using Bed-mounted Accelerometer Measurements
We have performed cough detection based on measurements from an accelerometer
attached to the patient's bed. This form of monitoring is less intrusive than
body-attached accelerometer sensors, and sidesteps privacy concerns encountered
when using audio for cough detection. For our experiments, we have compiled a
manually-annotated dataset containing the acceleration signals of approximately
6000 cough and 68000 non-cough events from 14 adult male patients in a
tuberculosis clinic. As classifiers, we have considered convolutional neural
networks (CNN), long short-term memory (LSTM) networks, and a residual neural
network (Resnet50). We find that all classifiers are able to distinguish
between the acceleration signals due to coughing and those due to other
activities including sneezing, throat-clearing and movement in the bed with
high accuracy. The Resnet50 performs the best, achieving an area under the ROC
curve (AUC) exceeding 0.98 in cross-validation experiments. We conclude that
high-accuracy cough monitoring based only on measurements from the
accelerometer in a consumer smartphone is possible. Since the need to gather
audio is avoided and therefore privacy is inherently protected, and since the
accelerometer is attached to the bed and not worn, this form of monitoring may
represent a more convenient and readily accepted method of long-term patient
cough monitoring.
Comment: This paper has been accepted at ICASSP 2021. Copyright information is
shown on the first page.
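The abstract does not give the paper's pre-processing details; one common first step for this kind of classifier, sketched below, is to segment the bed-mounted accelerometer stream into fixed-length overlapping frames that a CNN, LSTM or Resnet50 could then classify as cough or non-cough. The frame length and hop size here are illustrative values, not the paper's settings.

```python
def frame_signal(samples, frame_len, hop):
    """Split a 1-D acceleration signal into overlapping fixed-length frames.

    Each frame becomes one classification unit for the cough / non-cough
    classifier; trailing samples that do not fill a frame are dropped.
    """
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames

# Stand-in accelerometer samples (real input would be sensor readings).
signal = list(range(100))
frames = frame_signal(signal, frame_len=32, hop=16)
```

With `hop` smaller than `frame_len`, consecutive frames overlap, so a short cough burst is unlikely to be split across a frame boundary in every frame that contains it.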
COVID-19 Cough Classification using Machine Learning and Global Smartphone Recordings
We present a machine learning based COVID-19 cough classifier which can
discriminate COVID-19 positive coughs from both COVID-19 negative and healthy
coughs recorded on a smartphone. This type of screening is non-contact, easy to
apply, and can reduce the workload in testing centres as well as limit
transmission by recommending early self-isolation to those who have a cough
suggestive of COVID-19. The datasets used in this study include subjects from
all six continents and contain both forced and natural coughs, indicating that
the approach is widely applicable. The publicly available Coswara dataset
contains 92 COVID-19 positive and 1079 healthy subjects, while the second
smaller dataset was collected mostly in South Africa and contains 18 COVID-19
positive and 26 COVID-19 negative subjects who have undergone a SARS-CoV
laboratory test. Both datasets indicate that COVID-19 positive coughs are
15%-20% shorter than non-COVID coughs. Dataset skew was addressed by applying
the synthetic minority oversampling technique (SMOTE). A leave-out
cross-validation scheme was used to train and evaluate seven machine learning
classifiers: LR, KNN, SVM, MLP, CNN, LSTM and Resnet50. Our results show that
although all classifiers were able to identify COVID-19 coughs, the best
performance was exhibited by the Resnet50 classifier, which was best able to
discriminate between the COVID-19 positive and the healthy coughs with an area
under the ROC curve (AUC) of 0.98. An LSTM classifier was best able to
discriminate between the COVID-19 positive and COVID-19 negative coughs, with
an AUC of 0.94 after selecting the best 13 features from a sequential forward
selection (SFS). Since this type of cough audio classification is
cost-effective and easy to deploy, it is potentially a useful and viable means
of non-contact COVID-19 screening.
Comment: This paper has been accepted in "Computers in Biology and Medicine"
and is currently in production.
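SMOTE, used above to address dataset skew, synthesises new minority-class samples by interpolating between a minority sample and one of its nearest neighbours. The sketch below is a simplified stand-in for that idea, not the reference implementation, and the two-dimensional feature vectors are invented.

```python
import random

random.seed(1)

def smote_like_oversample(minority, n_new, k=2):
    """Generate synthetic minority samples in the spirit of SMOTE: pick a
    minority sample, pick one of its k nearest neighbours, and place a new
    point at a random position on the line segment between them."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = random.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: sq_dist(x, p))[:k]
        nb = random.choice(neighbours)
        gap = random.random()  # interpolation fraction in [0, 1)
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic

# Hypothetical minority-class (COVID-19 positive) feature vectors.
covid_coughs = [[0.2, 1.1], [0.3, 0.9], [0.25, 1.0], [0.4, 1.2]]
new_samples = smote_like_oversample(covid_coughs, n_new=6)
```

Because every synthetic point lies between two real minority samples, the oversampled class stays inside the region the real data occupies instead of simply duplicating examples.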
Automatic Cough Classification for Tuberculosis Screening in a Real-World Environment
Objective: The automatic discrimination between the coughing sounds produced
by patients with tuberculosis (TB) and those produced by patients with other
lung ailments.
Approach: We present experiments based on a dataset of 1358 forced cough
recordings obtained in a developing-world clinic from 16 patients with
confirmed active pulmonary TB and 35 patients suffering from respiratory
conditions suggestive of TB but confirmed to be TB negative. Using nested
cross-validation, we have trained and evaluated five machine learning
classifiers: logistic regression (LR), support vector machines (SVM), k-nearest
neighbour (KNN), multilayer perceptrons (MLP) and convolutional neural networks
(CNN).
Main Results: Although classification is possible in all cases, the best
performance is achieved using LR. In combination with feature selection by
sequential forward selection (SFS), our best LR system achieves an area under
the ROC curve (AUC) of 0.94 using 23 features selected from a set of 78
high-resolution mel-frequency cepstral coefficients (MFCCs). This system
achieves a sensitivity of 93% at a specificity of 95% and thus exceeds the
90% sensitivity at 70% specificity specification considered by the World
Health Organisation (WHO) as a minimal requirement for a community-based TB
triage test.
Significance: The automatic classification of cough audio sounds, when
applied to symptomatic patients requiring investigation for TB, can meet the
WHO triage specifications for the identification of patients who should undergo
expensive molecular downstream testing. This makes it a promising and viable
means of low-cost, easily deployable frontline screening for TB, which can
especially benefit developing countries with a heavy TB burden.
Comment: This paper has been accepted in Physiological Measurement (2021).
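Sequential forward selection, used above to pick 23 of 78 MFCC features, is a greedy loop: repeatedly add whichever remaining feature most improves the validation score. The sketch below uses a toy additive scorer; a real scorer would retrain the LR classifier on each candidate subset and report its AUC, and the feature names and utilities here are invented.

```python
def sequential_forward_selection(features, score, k):
    """Greedy SFS: at each step, add the feature whose inclusion gives the
    highest subset score, until k features have been selected."""
    selected = []
    remaining = list(features)
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy scorer: pretend each feature has a fixed utility and a subset's score
# is the sum of utilities. Real SFS would evaluate a trained classifier.
utility = {"mfcc_1": 0.4, "mfcc_7": 0.9, "mfcc_12": 0.6, "zcr": 0.1}
chosen = sequential_forward_selection(
    utility, lambda subset: sum(utility[f] for f in subset), k=2)
```

The cost is one model evaluation per candidate feature per step, which is why SFS is normally run inside the nested cross-validation loop described above rather than on the test data.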
Coding and Decoding Speech using a Biologically Inspired Coding System
A spike (event) based sound coding technique is presented in this study, in which the spikes are similar to those exhibited by type 1 fibers of the auditory nerve. This lossy coding technique has already been shown to be useful for inter-aural-time-difference-based sound source direction finding. Here, we show that decoding and resynthesising this code can produce intelligible speech even using a small number of spike trains. We have used a few composite techniques, including speaker verification, to assess the effectiveness of the coding technique on a large number of TIMIT sentences. This biologically inspired coding technique can provide suitable input for a spiking neural network, while maintaining the accurate time structure of sound.
Automatic non-invasive Cough Detection based on Accelerometer and Audio Signals
We present an automatic non-invasive way of detecting cough events based on
both accelerometer and audio signals.
The acceleration signals are captured by a smartphone firmly attached to the
patient's bed, using its integrated accelerometer.
The audio signals are captured simultaneously by the same smartphone using an
external microphone.
We have compiled a manually-annotated dataset containing such
simultaneously-captured acceleration and audio signals for approximately 6000
cough and 68000 non-cough events from 14 adult male patients in a tuberculosis
clinic.
LR, SVM and MLP are evaluated as baseline classifiers and compared with deep
architectures such as CNN, LSTM, and Resnet50 using a leave-one-out
cross-validation scheme.
We find that the studied classifiers can use either acceleration or audio
signals to distinguish between coughing and other activities including
sneezing, throat-clearing, and movement on the bed with high accuracy.
However, in all cases, the deep neural networks outperform the shallow
classifiers by a clear margin and the Resnet50 offers the best performance by
achieving an AUC exceeding 0.98 and 0.99 for acceleration and audio signals
respectively.
While audio-based classification consistently offers a better performance
than acceleration-based classification, we observe that the difference is very
small for the best systems.
Since the acceleration signal requires less processing power, and since the
need to record audio is sidestepped and thus privacy is inherently secured, and
since the recording device is attached to the bed and not worn, an
accelerometer-based, highly accurate, non-invasive cough detector may
represent a more convenient and readily accepted method for long-term cough
monitoring.
Comment: arXiv admin note: text overlap with arXiv:2102.0499
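The AUC figures quoted throughout these abstracts can be read as a rank statistic: the probability that a randomly chosen positive event is scored above a randomly chosen negative one, with ties counted as half. The scores below are invented for illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney rank statistic: the fraction of
    (positive, negative) pairs in which the positive scores higher,
    counting ties as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

cough_scores = [0.9, 0.8, 0.75, 0.6]     # classifier outputs on cough events
non_cough_scores = [0.4, 0.3, 0.6, 0.2]  # outputs on sneezes, movement, etc.
auc = roc_auc(cough_scores, non_cough_scores)
```

An AUC of 0.98, as reported above, therefore means the classifier ranks a cough above a non-cough event in 98% of such pairs, independent of any particular decision threshold.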
Accelerometer-based Bed Occupancy Detection for Automatic, Non-invasive Long-term Cough Monitoring
We present a new machine learning based bed-occupancy detection system that
uses the accelerometer signal captured by a bed-attached consumer smartphone.
Automatic bed-occupancy detection is necessary for automatic long-term cough
monitoring, since the time during which the monitored patient occupies the bed
is required to accurately calculate a cough rate. Accelerometer measurements are
more cost effective and less intrusive than alternatives such as video
monitoring or pressure sensors. A 249-hour dataset of manually-labelled
acceleration signals gathered from seven patients undergoing treatment for
tuberculosis (TB) was compiled for experimentation. These signals are
characterised by brief activity bursts interspersed with long periods of little
or no activity, even when the bed is occupied. To process them effectively, we
propose an architecture consisting of three interconnected components. An
occupancy-change detector locates instances at which bed occupancy is likely to
have changed, an occupancy-interval detector classifies periods between
detected occupancy changes and an occupancy-state detector corrects
falsely-identified occupancy changes. Using long short-term memory (LSTM)
networks, this architecture was demonstrated to achieve an AUC of 0.94. When
integrated into a complete cough monitoring system, the daily cough rate of a
patient undergoing TB treatment was determined over a period of 14 days. As the
colony forming unit (CFU) counts decreased and the time to positivity (TPP)
increased, the measured cough rate decreased, indicating effective TB
treatment. This provides a first indication that automatic cough monitoring
based on bed-mounted accelerometer measurements may present a non-invasive,
non-intrusive and cost-effective means of monitoring long-term recovery of TB
patients.
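The role of bed-occupancy detection in the cough-rate calculation can be made concrete: only coughs that fall inside detected occupancy intervals are counted, and the rate is normalised by occupied time rather than wall-clock time. The timestamps and intervals below are hypothetical.

```python
def cough_rate_per_hour(cough_times, occupancy_intervals):
    """Coughs per hour of monitored bed time.

    cough_times: cough event timestamps, in hours.
    occupancy_intervals: (start, end) pairs, in hours, during which the
    bed-occupancy detector reports the patient in bed. Coughs outside
    every interval are ignored, and the rate is normalised by occupied
    time, not elapsed time.
    """
    occupied = sum(end - start for start, end in occupancy_intervals)
    in_bed = sum(1 for t in cough_times
                 if any(start <= t <= end for start, end in occupancy_intervals))
    return in_bed / occupied if occupied else 0.0

# Hypothetical day: in bed 22:00-06:00 (8 h, written as 22.0-30.0) plus a
# 1 h afternoon nap; one detected cough falls outside occupancy.
intervals = [(22.0, 30.0), (14.0, 15.0)]
coughs = [22.5, 23.1, 25.0, 14.2, 16.0]
rate = cough_rate_per_hour(coughs, intervals)
```

Without the occupancy intervals, the 16.0 h event and the empty daytime hours would both distort the rate, which is why occupancy detection is described above as a prerequisite for long-term monitoring.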
Automatic Tuberculosis and COVID-19 cough classification using deep learning
We present a deep learning based automatic cough classifier which can
discriminate tuberculosis (TB) coughs from COVID-19 coughs and healthy coughs.
Both TB and COVID-19 are respiratory diseases, have cough as a predominant
symptom, and claim thousands of lives each year. The cough audio recordings
were collected in both indoor and outdoor settings and also uploaded via
smartphones by subjects around the globe, and thus contain various levels of
noise. The cough data include 1.68 hours of TB coughs, 18.54 minutes of
COVID-19 coughs and 1.69 hours of healthy coughs from 47 TB patients, 229
COVID-19 patients and 1498 healthy patients and were used to train and evaluate
a CNN, LSTM and Resnet50. These three deep architectures were also pre-trained
on 2.14 hours of sneeze, 2.91 hours of speech and 2.79 hours of noise for
improved performance. The class imbalance in our dataset was addressed by
applying the SMOTE data-balancing technique and by using performance metrics
such as the F1-score and AUC. Our study shows that the highest F1-scores of
0.9259 and 0.8631 were achieved by a pre-trained Resnet50 for the two-class
(TB vs COVID-19) and three-class (TB vs COVID-19 vs healthy) cough
classification tasks,
respectively. The application of deep transfer learning has improved the
classifiers' performance and makes them more robust as they generalise better
over the cross-validation folds. Their performance exceeds the TB triage test
requirements set by the World Health Organisation (WHO). The features
producing the best performance contain higher-order MFCCs, suggesting that the
differences between TB and COVID-19 coughs are not perceivable by the human
ear. This type of cough audio classification is non-contact, cost-effective and
can easily be deployed on a smartphone, making it an excellent tool for both
TB and COVID-19 screening.
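The F1-scores quoted for the two- and three-class tasks can be computed per class from a confusion matrix and then macro-averaged. The sketch below shows that calculation; the confusion counts are invented for illustration, not the study's results.

```python
def macro_f1(confusion):
    """Macro-averaged F1 from a confusion matrix given as
    confusion[true_class][predicted_class] = count."""
    classes = list(confusion)
    f1s = []
    for c in classes:
        tp = confusion[c][c]
        fp = sum(confusion[o][c] for o in classes if o != c)
        fn = sum(confusion[c][o] for o in classes if o != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical three-class (TB / COVID-19 / healthy) confusion counts.
cm = {
    "tb":      {"tb": 40, "covid": 5,   "healthy": 2},
    "covid":   {"tb": 4,  "covid": 200, "healthy": 25},
    "healthy": {"tb": 3,  "covid": 30,  "healthy": 1465},
}
score = macro_f1(cm)
```

Macro-averaging weights every class equally, which is why it is paired with SMOTE above: both guard against the large healthy class dominating the reported performance.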